A measure theoretic approach to information retrieval

Authors

  • Sándor Dominich
  • Tamás Kiezer
Abstract

The vector space model of information retrieval is one of the classical and widely applied retrieval models. Paradoxically, it has been characterised by a discrepancy between its formal framework and implementable form. The underlying concepts of the vector space model are mathematical terms: linear space, vector, and inner product. However, in the vector space model, the mathematical meaning of these concepts is not preserved. They are used as mere computational constructs or metaphors. Thus, the vector space model actually does not follow logically from the mathematical concepts on which it has been claimed to rest. This problem has been recognised for more than two decades, but no proper solution has emerged so far. The present paper proposes just such a solution to this very problem. Firstly, the concept of retrieval is defined based on measure theory. Then, retrieval is particularised using fuzzy set theory. As a result, the retrieval function is conceived as the cardinality of the intersection of two fuzzy sets. This view makes it possible to build a connection to linear spaces. Thus, the classical and the generalised vector space models as well as the latent semantic indexing model gain a correct formal background with which they are consistent. At the same time it becomes clear that the inner product is not a necessary ingredient of the vector space model. Moreover, this view makes it possible to consistently formulate new retrieval methods: in linear space with general basis, entropy-based, and probability-based. Experimental results using standard test collections are also reported.
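As a rough illustration of the central idea in the abstract, the retrieval function viewed as the cardinality of the intersection of two fuzzy sets, the following minimal sketch treats a document and a query as fuzzy sets of terms. The abstract does not say which intersection operator or cardinality is used, so the algebraic product and the sigma-count (sum of membership degrees) are assumptions here, and the names `fuzzy_intersection`, `sigma_count`, and `retrieval_score` are hypothetical.

```python
from typing import Dict

# A fuzzy set of terms: term -> membership degree in [0, 1].
FuzzySet = Dict[str, float]


def fuzzy_intersection(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Intersection via the algebraic product (an assumed choice of operator)."""
    # Terms missing from either set have membership 0, so only shared terms remain.
    return {term: a[term] * b[term] for term in a.keys() & b.keys()}


def sigma_count(s: FuzzySet) -> float:
    """Fuzzy cardinality as the sum of membership degrees (sigma-count, assumed)."""
    return sum(s.values())


def retrieval_score(doc: FuzzySet, query: FuzzySet) -> float:
    """Score a document as the cardinality of its fuzzy intersection with the query."""
    return sigma_count(fuzzy_intersection(doc, query))


# Illustrative membership degrees; these are made-up values, not data from the paper.
doc = {"retrieval": 0.8, "vector": 0.5, "fuzzy": 0.2}
query = {"retrieval": 1.0, "fuzzy": 0.5}

score = retrieval_score(doc, query)  # 0.8 * 1.0 + 0.2 * 0.5, about 0.9
```

Under these assumptions the score is numerically the same sum of products that a dot product of term-weight vectors would give, yet no inner product space is required to define it. Swapping in a different operator, for example `min` in place of the product, would yield a different retrieval function within the same scheme.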

Similar resources

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

Combination of real options and game-theoretic approach in investment analysis

Investments in technology create a large amount of capital investments by major companies. Assessing such investment projects is identified as critical to the efficient assignment of resources. Viewing investment projects as real options, this paper expands a method for assessing technology investment decisions in the linkage existence of uncertainty and competition. It combines the game-theore...

A Game Theoretic Approach for Sustainable Power Systems Planning in Transition

Intensified industrialization in developing countries has recently resulted in huge electric power demand growth; however, electricity generation in these countries is still heavily reliant on inefficient and traditional non-renewable technologies. In this paper, we develop an integrated game-theoretic model for effective power systems planning through balancing between supply and demand for e...

Coordinating a decentralized supply chain with a stochastic demand using quantity flexibility contract: a game-theoretic approach

A supply chain includes two or more parties linked by a flow of goods, information, and funds. In a decentralized system, supply chain members make decisions regardless of the effects of those decisions on the performance of the other members and of the entire supply chain. This is the key issue in supply chain management, that the mechanism should be developed in which different objectives should be align...

Image Retrieval Using Mutual Information Hou

In this paper, we study an information-theoretic approach to image similarity measurement for content-based image retrieval. In this novel scheme, similarities are measured by the amount of information the images contain about one another: mutual information (MI). The given approach is based on the premise that two similar images should have high mutual information, or equivalently, the query...

A game Theoretic Approach to Pricing, Advertising and Collection Decisions adjustment in a closed-loop supply chain

This paper considers advertising, collection and pricing decisions simultaneously for a closed-loop supply chain (CLSC) with one manufacturer (he) and two retailers (she). A new multiplicatively separable demand function is proposed which is influenced by pricing and advertising. In this paper, three well-known scenarios in game theory, including the Nash, Stackelberg and Cooperative games, are expl...

Journal title:
  • JASIST

Volume 58  Issue 

Pages  -

Publication date 2007